Speech Recognition Only with Supra-segmental Features - Hearing Speech as Music -

Authors

Nobuaki MINEMATSU

Tazuko NISHIMURA

Takao MURAKAMI

Keikichi HIROSE

Abstract

This paper proposes a novel paradigm of speech recognition in which only supra-segmental features are used. Absolute properties of speech events, such as formants and spectra, are completely discarded, and only the relative and differential properties of the events are extracted as phonic contrasts. These phonic contrasts are treated as supra-segmental features, and they are shown mathematically not to carry non-linguistic information such as speaker identity, age, and gender. This leads us to expect that speaker-independent speech recognition should be possible with reference models built from a single speaker's speech. Experiments on isolated vowel-sequence recognition confirm this expectation and show that the new paradigm outperforms the conventional one, which uses more than four thousand speakers, even for noisy speech. Listeners often perceive musical sounds by capturing only their contrasts and structure, which suggests that the proposed paradigm hears speech as music.
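The invariance claim in the abstract can be illustrated with a small sketch. The abstract does not say which contrast measure or which model of speaker variation the authors use, so everything here is an assumption made for illustration: each speech event is modelled as a one-dimensional Gaussian, the contrast between two events is their Bhattacharyya distance, and a change of speaker is modelled as an affine map x -> a*x + b. Under those assumptions the matrix of pairwise contrasts is unchanged by the map. The function names and the toy numbers are hypothetical.

```python
import math


def bhattacharyya_1d(m1, v1, m2, v2):
    """Bhattacharyya distance between two 1-D Gaussians given as (mean, variance)."""
    mean_term = 0.25 * (m1 - m2) ** 2 / (v1 + v2)
    var_term = 0.5 * math.log((v1 + v2) / (2.0 * math.sqrt(v1 * v2)))
    return mean_term + var_term


def contrast_matrix(events):
    """All pairwise contrasts between events (assumed stand-in for the 'structure')."""
    return [[bhattacharyya_1d(mi, vi, mj, vj) for (mj, vj) in events]
            for (mi, vi) in events]


def affine_map(events, a, b):
    """Assumed speaker change x -> a*x + b: means shift and scale, variances scale by a^2."""
    return [(a * m + b, a * a * v) for (m, v) in events]


# Three toy "vowel" events as (mean, variance); the numbers are made up.
speaker_a = [(1.0, 0.20), (2.5, 0.35), (4.0, 0.10)]
speaker_b = affine_map(speaker_a, a=1.7, b=-0.6)

structure_a = contrast_matrix(speaker_a)
structure_b = contrast_matrix(speaker_b)

# Largest element-wise difference between the two contrast matrices.
max_diff = max(abs(x - y)
               for row_a, row_b in zip(structure_a, structure_b)
               for x, y in zip(row_a, row_b))
print(max_diff < 1e-9)
```

The absolute means and variances of the two toy speakers differ, yet their contrast matrices agree up to floating-point rounding. This is the sense in which a purely relative description could, under the stated assumptions, discard speaker-dependent information.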

Related articles

The Effect of Using PRAAT Software on Pre-Intermediate EFL Learners’ Supra Segmental Features

The present study investigated the effect of using PRAAT, a free software package for the scientific analysis of speech in phonetics, on the supra-segmental features (i.e., intonation and stress) of pre-intermediate Iranian learners of English as a foreign language (EFL). The study used a quasi-experimental design with a pre-test and a post-test. In doing so...

Analysis and Modelling of Emotional Speech in Spanish

The importance of speech prosody for conveying emotional information has been extensively underlined in the literature. Major elements such as pitch, tempo and stress are presented as the main acoustic correlates of emotion in human speech. Nevertheless, as several authors have shown, voice quality is also a relevant feature in emotion recognition. In this paper, we present the prosodic analysi...

Syllabic Pitch Tuning for Neutral-to-emotional Voice Conversion

Prosody plays an important role in neutral-to-emotional voice conversion. Prosodic features like pitch are usually estimated and altered at a segmental level based on short windowing of speech signal (where the signal is expected to be quasi-stationary). This results in a frame-wise change of acoustical parameters for synthesizing emotionalized speech. In order to convert a neutral speech to an...

Music Training Program: A Method Based on Language Development and Principles of Neuroscience to Optimize Speech and Language Skills in Hearing-Impaired Children

Introduction: In recent years, music has been employed in many intervention and rehabilitation programs to enhance cognitive abilities in patients. Numerous studies show that music therapy can help improve language skills in patients, including the hearing impaired. In this study, a new method of music training is introduced based on principles of neuroscience and capabilities of Persian languag...

Automatic Segmentation of Continuous Speech on Word Level Based on Supra-segmental Features

This article presents a cross-lingual study of Hungarian and Finnish on the segmentation of continuous speech at the word and phrase level by examining supra-segmental parameters. A word-level segmenter has been developed that can indicate word boundaries with acceptable precision for both languages. The ultimate aim is to increase the robustness of speech recognition on the lan...

Publication date: 2006
